Abstract. In this paper, we develop the idea to partition the edges of a weighted graph in order to uncover 
overlapping communities of its nodes. Our approach is based on the construction of different types of 
weighted line graphs, i.e. graphs whose nodes are the links of the original graph, that encapsulate differently 
the relations between the edges. Weighted line graphs are argued to provide an alternative, valuable 
representation of the system's topology, and are shown to have important applications in community 
detection, as the usual node partition of a line graph naturally leads to an edge partition of the original 
graph. This identification allows us to use traditional partitioning methods in order to address the 
long-standing problem of the detection of overlapping communities. We apply it to the analysis of different 
social and geographical networks. 
1 Introduction 

In the last decade, the interdisciplinary field of complex
networks has led to the development of universal tools
to characterise and model systems as diverse as
information, biological or social networks [3]. Many studies
focus on the properties of the vertices, e.g. studying
their degree distribution or ranking them by some measure.
However, graphs are both a set of vertices and a
set of relationships between vertices, the edges. It is
therefore sometimes useful to look at a network from the
viewpoint of the edges. We do this by defining 'weighted
line graphs' for any type of graph, extending our original
work on weighted line graphs for simple graphs [23]. Our
weighted line graphs are topologically equivalent to the
standard line graph of the literature [1,2,3]. However, the
weights we define play a crucial role in avoiding a bias
inherent in unweighted line graphs towards high degree
vertices in the original graph. Our work can be seen as
providing a general framework to shift our view from a
vertex-centric viewpoint to an edge-centric one.

We illustrate our ideas in the context of community
detection [5,6,7,8]. When dealing with complex networks,
one crucial step is the identification of communities or
modules, that is, highly connected subgraphs. It has
been shown that many systems of interest are organised in
a modular way and that these topological modules usually
correspond to functional sub-units. In a large number of
situations, these building blocks themselves may be modular,
in which case the network is said to be hierarchical.
Modularity at different scales has long been argued to be a
universal property of complex systems because of the crucial
evolutionary advantage it confers, by providing stable
intermediate forms (modules) and thereby improving the
system's adaptability [9]. Multi-scale modularity is also
associated with a separation of time scales for the dynamics
taking place on the graph [10,11,12,13], which is essential
in order to ensure the persistence of diversity in the
system [14].

The fundamental idea behind most community detection
methods is to partition the nodes of the network into
modules. By doing so, each node is assigned to
one single module. However, a vertex partition has the
disadvantage of being incompatible with the existence of
overlapping communities, i.e. situations where nodes belong
to several communities. This overlap is known to be
present at the interface between modules, but can also be
pervasive in the whole network. This is the case in
many social networks where individuals typically belong
to several communities defined by their type of interaction,
e.g. work, sport buddy, family, etc., but also in biological
networks where proteins may belong to several functional
categories. In those situations where the interface between
the communities occurs throughout the system, a partition
of the nodes is questionable as it imposes undesired
constraints on the community detection problem. There
are many different approaches to finding overlapping communities
(for example see [18,19,20,21,22,23,24,25,26,27,28,29]).
A popular choice is k-clique percolation, which consists in
looking for connected components of cliques of size k [18].
However, this approach has several disadvantages: its
outcome strongly depends on the sparsity of the network,
it has a single integer parameter with which to set the
scale of communities found, it is not easily implementable
for weighted networks, and it is not applicable to multi-scale
networks. For instance, it fails on one of the classic tests for
community detection algorithms, the Karate club graph of
Zachary [5Tj.

Our approach is based on the observation that, even 
if nodes may belong to multiple groups, links often corre- 
spond to one particular type of interaction. For instance, 
in the case of social networks the connection between two 
people is usually for one dominant reason (work, sport 
interest or family). In contrast to nodes, links therefore 
typically belong to one single module. In order to ex- 
ploit this observation, we define communities as parti- 
tions of links rather than of nodes. The edges incident 
at a single node may belong to several modules and in 
this sense, nodes can be members of several communities. 
This change of perspective has several advantages. First, 
it is a very simple idea. It is perhaps surprising that we 
have few other attempts to define simple edge partitions. 
Secondly, it is a very general, flexible framework. We simply
apply standard vertex partitioning to the weighted
line graphs defined below. Thirdly, link partitions natu- 
rally produce overlapping communities while uncovering 
a multi-scale, hierarchical organisation. Indeed, the differ- 
ent levels of a dendrogram correspond to partitions whose 
communities are nested in each other. Uncovering edge 
partitions at different scales is therefore capable of reveal- 
ing the hierarchical, overlapping structure of a network. 
Finally, our approach can easily be generalised in order to 
analyse weighted and/or directed networks. 

This article is organised as follows. First we recall from
[23] how to construct various useful types of line graphs
of simple graphs, and expose the central ideas of our approach.
In section 3, we show how to generalise the
method to weighted graphs and how to overcome the com- 
plications which arise in this case. In section 4, we show 
some examples of how our methods work in the context 
of community detection. In section 5, we discuss possible 
generalisations of our work to the case of multigraphs and 
directed graphs. In section 6, finally, we summarise our 
findings and conclude. 



2 Simple Graphs G 
Overview 

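In our approach we find it useful to start from the representation
of a network G in terms of its incidence matrix
B. Suppose our original simple graph G has N vertices,
which we will label with mid-alphabet Latin characters
$i, j, \ldots$, and L edges, which we label with early Greek
alphabet characters $\alpha, \beta, \ldots$. We define the incidence
matrix^1 of a simple graph G, B(G), such that $B_{i\alpha}$ is 1 if link
$\alpha$ is related to node $i$, and 0 otherwise. This contains
all the information about the graph G. For instance, the
adjacency matrix A of the graph G is given by

$$A_{ij} = \sum_{\alpha} B_{i\alpha} B_{j\alpha} (1 - \delta_{ij}). \qquad (1)$$

Thus the degree of a vertex is $k_i = \sum_{\alpha} B_{i\alpha}$.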
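As a rough illustration of the incidence-matrix representation, the following sketch builds B for a small graph and recovers the adjacency matrix and the degrees from it. The helper names and the toy edge list are hypothetical, and the code assumes that equation (1) is the product of B with its transpose with the diagonal removed.

```python
import numpy as np

def incidence_matrix(n_vertices, edges):
    """Return the N x L matrix B with B[i, a] = 1 if edge a touches vertex i."""
    B = np.zeros((n_vertices, len(edges)), dtype=int)
    for a, (i, j) in enumerate(edges):
        B[i, a] = 1
        B[j, a] = 1
    return B

def adjacency_from_incidence(B):
    """Assumed form of equation (1): A_ij = sum_a B_ia B_ja (1 - delta_ij)."""
    A = B @ B.T
    np.fill_diagonal(A, 0)  # the (1 - delta_ij) factor removes the diagonal
    return A

# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
B = incidence_matrix(4, edges)
A = adjacency_from_incidence(B)
k = B.sum(axis=1)  # vertex degrees k_i = sum_a B_ia
```

Each column of B sums to two because every edge of a simple graph has two ends, while the row sums give the vertex degrees.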

We will use the concept of random walkers on graphs to
motivate our choice of weights in our weighted line graphs.
In terms of the vertices of G, the usual random walk process
is defined such that at each step the walkers move
from their current vertex to a neighbouring one chosen
with equal probability. Thus the density of random walkers
on node $i$ at step $n$ is $p_{i;n}$ where

$$p_{i;n+1} = \sum_{j} \frac{A_{ij}}{k_j} p_{j;n}. \qquad (2)$$
As we look at community detection on our weighted
line graphs, it is useful to note here that the widely-used
Newman-Girvan "modularity" Q [15] can be interpreted
in this dynamical context [12,13]. The best vertex partition
of the graph is often found by maximising Newman-Girvan
modularity, which measures whether there are more edges
within communities than would be expected on the basis
of chance. The quality function maximised is the modularity
Q where^2

$$Q(A) = \sum_{C \in \mathcal{V}} \sum_{i,j \in C} \left[ \frac{A_{ij}}{W} - \frac{k_i k_j}{W^2} \right] = \sum_{C \in \mathcal{V}} \sum_{i,j \in C} \left[ \frac{A_{ij}}{k_j} \pi^*_j - \pi^*_i \pi^*_j \right]. \qquad (3)$$

Here $\pi^*_i = \lim_{n \to \infty} p_{i;n}$ is the long time distribution of
random walkers, which is well-defined and unique if the
dynamics is ergodic. For a simple graph it is given by $\pi^*_i = k_i/W$,
where $W = \sum_{i,j} A_{ij}$, under quite general circumstances
[32]. The indices $i$ and $j$ run over the nodes in
community C while C is taken through the different communities
of the vertex partition $\mathcal{V}$. Modularity is therefore
equivalent to the probability of a random walker
remaining in the same community over two successive time
steps, minus the probability for independent walkers to
be in those communities at those times. A partition which
gives a large value of Q is usually taken to be a good
community structure for the graph G.
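The random-walk reading of modularity can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name and the toy graph (two triangles joined by one edge) are hypothetical, the graph is assumed undirected with no isolated vertices, and the stationary distribution is taken to be $k_i/W$ as stated in the text.

```python
import numpy as np

def modularity_random_walk(A, communities):
    """Probability of staying in a community over one step, minus the
    probability that two independent stationary walkers share it."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=0)              # degrees (or strengths)
    W = k.sum()
    pi = k / W                     # stationary distribution of the walk
    T = A / k                      # T[i, j] = A_ij / k_j
    Q = 0.0
    for C in communities:
        idx = np.ix_(C, C)
        stay = (T * pi)[idx].sum()            # sum of T_ij * pi_j inside C
        independent = np.outer(pi, pi)[idx].sum()
        Q += stay - independent
    return Q

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

Q_two = modularity_random_walk(A, [[0, 1, 2], [3, 4, 5]])
Q_one = modularity_random_walk(A, [[0, 1, 2, 3, 4, 5]])
```

For this graph the two-triangle partition gives Q = 5/14, while placing every vertex in a single community gives Q = 0, as expected for a trivial partition.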



^1 This can be considered to be the adjacency matrix of a
bipartite graph. This graph is a special case of what is known
as an incidence graph: the incidence of a set of lines with a set
of points in a Euclidean space of finite dimension.

^2 We also note that communities at different scales can be
found by introducing a resolution parameter in the definition
of modularity [16,17].

Random walk on the edges and weighted line graphs

Our desire to move from a vertex-centric viewpoint to
one focussed on edges suggests that we consider random
walkers moving from edge to edge. On a simple graph,
each step of such a walk has two characteristic quantities
to consider, the degrees of the vertices at each end, $k_i$ and
$k_j$. This leads naturally to two different processes:

- a random walk where the walkers can jump to all available
  edges with equal probability, namely $1/(k_i + k_j - 2)$.
  When $k_i \neq k_j$, the walker goes with a different
  probability through $i$ or $j$, and we therefore call this
  process a "link-link random walk";

- a "link-node-link random walk", where a walker first
  jumps with equal probability to one of the two nodes
  to which it is attached, say $i$. It then moves to a new
  link incident at $i$, again choosing with equal probability
  from those available. Thus with probability $1/(2(k_i - 1))$
  it ends on one of the links leaving $i$ and with probability
  $1/(2(k_j - 1))$ it finishes on a new link leaving $j$.
  As this process is not defined for vertices of degree one,
  we ignore such vertices and so the walker will always
  jump to the other vertex.

Fig. 1. The weighted line graph transformation emphasises the
role of edges in the network while properly accounting for the
degree heterogeneity present in the network. Each link in the
original simple graph (top) corresponds to a node in the line
graph (bottom) while nodes transform into weighted cliques.
The "Link-Node-Link random walk" on the original graph, as
defined in the text, is equivalent to an unbiased random walk
on the nodes of the weighted line graph. In this illustration,
the width of the links is proportional to their weight and the
dotted link is transformed into the darkened node.

The simplest way to shift the focus from vertices to
edges is to construct the other product from the rectangular
incidence matrix B. Thus we define the line graph
L(G) through its L x L adjacency matrix C:

$$C_{\alpha\beta} = \sum_{i} B_{i\alpha} B_{i\beta} (1 - \delta_{\alpha\beta}). \qquad (4)$$

The line graph is a well known construction [1,2,3] that
almost perfectly encodes the topological properties of the
original graph. The structure of G can be recovered completely
from its line graph L(G) for almost any graph, the exceptions
being a triangle or a star network of four nodes [1]. The
vertices of the line graph are in one-to-one correspondence
with the edges of the original graph G, except for the edges
of leaves (i.e. edges which end in a degree one vertex). A
vertex in the original graph of degree $k$ is mapped into
$k(k - 1)/2$ edges of the line graph.

If we now perform the usual vertex random walk on
the vertices of the line graph C(G) we see that this corresponds
to

$$p_{\alpha;n+1} = \sum_{\beta} \frac{C_{\alpha\beta}}{k_\beta} p_{\beta;n}, \qquad (5)$$

where $k_\alpha = \sum_{\beta} C_{\alpha\beta} = (k_i + k_j - 2)$ and $i$ and $j$ are the
vertices at the end of edge $\alpha$ in the original graph G. Consequently,
we observe that the usual random walk on the
vertices of this line graph C(G) corresponds to a "link-link
random walk" on the edges $\alpha$ of G. It is interesting to note
that this type of line graph has found many applications
in recent years, see for instance [33,34,35,36,37,38,39,40].
However, its big drawback is that each vertex $i$ in the original
graph G contributes $k(k - 1)/2$ edges to C(G) even
though its importance in the original graph could be estimated
to be just $k$. That is, the large degree vertices, the
hubs, are given too much prominence in the line graph
[2125].
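The over-representation of hubs in the unweighted line graph is easy to see on a star. The following sketch is illustrative only: the helper names and the star example are hypothetical, and the line graph is assumed to be the product of the transposed incidence matrix with itself, with the diagonal removed.

```python
import numpy as np

def incidence_matrix(n_vertices, edges):
    """B[i, a] = 1 if edge a touches vertex i."""
    B = np.zeros((n_vertices, len(edges)), dtype=int)
    for a, (i, j) in enumerate(edges):
        B[i, a] = B[j, a] = 1
    return B

def line_graph_C(B):
    """Unweighted line graph: C_ab = sum_i B_ia B_ib (1 - delta_ab)."""
    C = B.T @ B
    np.fill_diagonal(C, 0)
    return C

# Hypothetical example: a star whose hub (vertex 0) has degree k = 4.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
B_star = incidence_matrix(5, star_edges)
C_star = line_graph_C(B_star)

k = B_star.sum(axis=1)
line_graph_edges = int(np.triu(C_star, 1).sum())  # number of edges in C(G)
line_degrees = C_star.sum(axis=0)                 # k_i + k_j - 2 for edge (i, j)
```

A hub of degree 4 produces 4 x 3 / 2 = 6 line-graph edges, and each line-graph vertex has degree 4 + 1 - 2 = 3, which is the quadratic growth that the weighted construction is designed to avoid.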

The solution suggested in [23] is to define a new type of
line graph, the weighted line graph D(G) with adjacency
matrix

$$D_{\alpha\beta} = \sum_{i,\, k_i > 1} \frac{B_{i\alpha} B_{i\beta}}{k_i - 1} (1 - \delta_{\alpha\beta}). \qquad (6)$$

In the context of projecting bipartite networks this is a
well known weighting [41]. If we consider the usual vertex
random walk on this line graph D(G), so

$$p_{\alpha;n+1} = \sum_{\beta} \frac{D_{\alpha\beta}}{\sum_{\gamma} D_{\gamma\beta}} p_{\beta;n}, \qquad (7)$$

then we see that this is equivalent to a link-node-link random
walk on the original graph G, see Fig. 1.
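The equivalence with the link-node-link walk can be verified on a small example. This is a minimal sketch, assuming that each vertex of degree $k_i > 1$ contributes a weight $1/(k_i - 1)$ to every pair of its edges; the function name and the toy graph are hypothetical.

```python
import numpy as np

def weighted_line_graph_D(n_vertices, edges):
    """D_ab = sum over shared vertices i with k_i > 1 of 1 / (k_i - 1), a != b."""
    B = np.zeros((n_vertices, len(edges)))
    for a, (i, j) in enumerate(edges):
        B[i, a] = B[j, a] = 1.0
    k = B.sum(axis=1)
    inv = np.zeros(n_vertices)
    inv[k > 1] = 1.0 / (k[k > 1] - 1.0)   # degree-one vertices are ignored
    D = B.T @ (inv[:, None] * B)
    np.fill_diagonal(D, 0.0)              # no self-loops in the line graph
    return D

# Hypothetical toy graph: triangle 0-1-2 plus a pendant vertex 3 attached to 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
D = weighted_line_graph_D(4, edges)
T = D / D.sum(axis=0)   # T[a, b] = probability of stepping from edge b to edge a
```

Starting from the edge (1, 2), the walker reaches (0, 1) with probability 1/2 (through vertex 1 of degree 2) and each of (0, 2) and (2, 3) with probability 1/4 (through vertex 2 of degree 3), matching the $1/(2(k_i - 1))$ rule. The strength of every line-graph vertex is 2, or 1 for the edge ending in a leaf.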

Central idea 

At the heart of our approach is the construction of a line
graph in order to represent the system from an edge-centric
viewpoint. As we have shown in the previous section,
there exist different ways to project the incidence matrix
onto a line graph, and each projection is associated with
a different dynamics taking place on the edges, i.e., with
a different interpretation of what the relations between
edges are. As we will see in the next section, the number
of ways to construct a line graph, when the original
graph is weighted, is even larger. The selection of a sensible
projection is therefore an essential ingredient, which
may in principle depend on the system under scrutiny but
should in any case avoid biasing the representation of the
network, for instance by giving too much importance to
certain nodes. This is the reason why D(G) is preferred
to C(G) when analysing simple graphs.
Fig. 2. When applied to the weighted but undirected network
on the left (width of the links is proportional to their weight in
this illustration), the weighted line graph transformation leads
to the weighted and directed network shown on the right. In
this example, the dotted link is transformed into the darkened
node.

3 Undirected Weighted Graphs G 

Suppose now we have an undirected but weighted graph
G. The incidence matrix may be defined as before, with
$B_{i\alpha} = 1$ if edge $\alpha$ is incident to vertex $i$ and all other
entries of this rectangular incidence matrix equal to zero. To
record the weights of the edges it is useful to define a
second, weighted incidence matrix $\tilde{B}$ as

$$\tilde{B}_{\alpha j} = w_\alpha \qquad (8)$$

where edge $\alpha$ is incident on vertex $j$ and has weight $w_\alpha$.
Each vertex then has degree $k_i$ and strength $s_i$ given by

$$k_i = \sum_{\alpha} B_{i\alpha}, \qquad s_i = \sum_{\alpha} \tilde{B}_{\alpha i}. \qquad (9)$$

The adjacency matrix of the original graph G is then

$$A_{ij} = \sum_{\alpha = (i,j)} B_{i\alpha} \tilde{B}_{\alpha j} = \sum_{\alpha = (i,j)} w_\alpha, \qquad (10)$$

where $\alpha = (i,j)$ indicates that the sum is taken over
all edges from vertex $j$ to $i$. This matrix is symmetric as
required.

If we wish to use the weight information of G, the logical
generalisation of the definition of C for unweighted
graphs G of [23] is as follows^3:

$$C_{\alpha\beta} = \sum_{i} \tilde{B}_{\alpha i} B_{i\beta} (1 - \delta_{\alpha\beta}). \qquad (11)$$

This definition for the adjacency matrix of a line graph
mimics our construction of the adjacency matrix A of the
graph G in (10), which also used both $B$ and $\tilde{B}$. However,
even if the original graph G is undirected, this adjacency
matrix is not symmetric, i.e., the line graph C(G) is directed.
If we think in terms of random walks from edge
$\beta$ to vertex $i$ and then to edge $\alpha$ then it is natural that
the edge weights are linked to the stubs leaving vertex $i$,
hence the use of $\tilde{B}$ in (11). The probability of moving to an
adjacent edge is proportional to the target edge's weight
$w_\alpha$ but is independent of the current edge's weight $w_\beta$.

^3 If we ignore the weights completely then we get a line graph
which is the traditional unweighted one, L(G). This would be
defined using only B as $L_{\alpha\beta} = \sum_{i} B_{i\alpha} B_{i\beta} (1 - \delta_{\alpha\beta})$. This
representation only records the topological information of the
original graph.

The problem with the definition of C in (11) is that
even though it involves the weights of the edges through
$\tilde{B}$, a vertex of strength $s$ in graph G is going to contribute
$O(ks)$ to the total weight of these line graphs, which seems
like over counting. High degree, high strength vertices are
too prominent. The solution is to reduce the weight
assigned to each link in the weighted line graph by $O(s^{-1})$.
Thus we consider the adjacency matrix

$$E_{\alpha\beta} = \sum_{i,\, s_i \neq w_\beta} \frac{\tilde{B}_{\alpha i} B_{i\beta}}{s_i - w_\beta} (1 - \delta_{\alpha\beta}). \qquad (12)$$

This is also a more natural definition when we consider
the dynamics of a random walker moving from edge $\beta$
to vertex $i$ and then to edge $\alpha$. The first step is to each
end of the edge $\beta$ with equal probability (the $B_{i\beta}$ term) while
the latter step to arrive at edge $\alpha$ is taken in proportion
to the weights of the edges at $i$ (the $\tilde{B}_{\alpha i}$ term). There exist
many other ways to project the incidence graph B(G)
onto a weighted line graph^4 but this definition is the one
which preserves the dynamics of random walkers. The dynamics
of random walkers is important in many contexts
of graph theory, such as in the PageRank algorithm or in
the context of Newman-Girvan modularity Q (3) as noted
above.

When the original graph G is unweighted and undirected
then this weighted line graph E(G) reduces to the
weighted line graph described in [23]. However, if the original
graph G is weighted then the weighted line graph
E(G) will be both directed and weighted. One special case
is when the original graph G is ergodic, in which case so is
this weighted line graph E(G).
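A small numerical sketch may help to see why E(G) is directed even when G is not. It rests on an assumption that should be checked against the original equation (12): the step from edge $\beta$ through vertex $i$ to edge $\alpha$ is taken with weight $w_\alpha / (s_i - w_\beta)$, which is the normalisation that reduces to $1/(k_i - 1)$ for unit weights. The function name, edge list and weights are hypothetical.

```python
import numpy as np

def weighted_line_graph_E(n_vertices, edges, weights):
    """E[a, b]: weight of the step from edge b to edge a via a shared vertex."""
    L = len(edges)
    w = np.asarray(weights, dtype=float)
    B = np.zeros((n_vertices, L))          # B[i, a] = 1 if edge a touches i
    for a, (i, j) in enumerate(edges):
        B[i, a] = B[j, a] = 1.0
    Bt = B.T * w[:, None]                  # Bt[a, i] = w_a if edge a touches i
    k = B.sum(axis=1)                      # degrees
    s = Bt.sum(axis=0)                     # strengths
    E = np.zeros((L, L))
    for b in range(L):
        for i in np.nonzero(B[:, b])[0]:
            if k[i] > 1:                   # a leaf offers no other edge to move to
                E[:, b] += Bt[:, i] / (s[i] - w[b])
        E[b, b] = 0.0                      # the (1 - delta_ab) factor
    return E

# Hypothetical weighted graph: triangle 0-1-2 plus pendant vertex 3 attached to 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
E = weighted_line_graph_E(4, edges, [1.0, 2.0, 3.0, 4.0])
E_unweighted = weighted_line_graph_E(4, edges, [1.0, 1.0, 1.0, 1.0])
```

With these weights the step from edge (1, 2) to edge (0, 2) has weight 3/7 while the reverse step has weight 1/3, so E is not symmetric. With unit weights the matrix is symmetric and the same pair receives weight 1/2, as in the unweighted construction D(G).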

^4 Other interesting generalisations include
$D_{\alpha\beta} = \sum_{i,\, k_i > 1} \frac{\tilde{B}_{\alpha i} B_{i\beta}}{k_i - 1} (1 - \delta_{\alpha\beta})$ and ...

4 Applications

Once the projection from a weighted graph G to the weighted
line graph E(G) has been made, it is possible to
use any vertex metric on the line graph in order to characterise
the structure of the edges in the original graph.
It is for instance possible to look at the centrality or the
clustering coefficient of the nodes of the line graph in order
to uncover the role of the original edges. A study of
the degree distribution in the line graph is sensitive to
degree-degree correlations of neighbouring vertices in the
original graph.

Here though we will focus on the vertex partition of
the weighted line graph E(G) (12) in order to produce
an edge partition of the original graph G. In principle,
any vertex partitioning scheme can be used. However, since
optimisation of modularity is related to the behaviour of
random walkers on a graph and our construction of E(G)
preserves the dynamics of random walkers, it makes sense
to apply the modularity optimisation approach to find the
partitions of the weighted line graph E(G) (12). So we will
search for maxima of

$$Q(E) = \sum_{C \in \mathcal{V}} \sum_{\alpha,\beta \in C} \left[ \frac{E_{\alpha\beta}}{s^{\mathrm{out}}_\beta} \pi^*_\beta - \pi^*_\alpha \pi^*_\beta \right], \qquad (13)$$

where the out-strength is $s^{\mathrm{out}}_\beta = \sum_{\alpha} E_{\alpha\beta}$. The vector
$\pi^*_\beta$ is the dominant eigenvector of the transition matrix
$(E_{\alpha\beta}/s^{\mathrm{out}}_\beta)$ with eigenvalue one, normalised such that
$\sum_{\alpha} \pi^*_\alpha = 1$. Let us emphasise that a weighted but undirected
graph G produces a weighted line graph E(G) which
is also directed, so that the equilibrium walker distribution
$\pi^*_\alpha$ is non-trivial. This has to be computed first, which
we do by using the power method [43].

Maxima of Q(E) can rarely be found exactly but
there are many good approximate algorithms. For our own
convenience we use the Louvain algorithm of [42] to find a
partition of the vertices of E(G) which gives a large value
of modularity Q(E).
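The two computational steps described above, finding the stationary distribution and then evaluating Q(E) for a candidate partition, can be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical, the matrix E is a hand-written 4 x 4 example, and the iteration uses a "lazy" averaged step (an assumption added here so that the iteration also converges on periodic chains; it has the same fixed point as the plain power method).

```python
import numpy as np

def stationary_distribution(E, tol=1e-12, max_iter=10000):
    """Iterate pi <- (pi + T pi) / 2 with T[a, b] = E[a, b] / s_out[b]."""
    T = E / E.sum(axis=0)
    pi = np.full(E.shape[0], 1.0 / E.shape[0])
    for _ in range(max_iter):
        new = 0.5 * (pi + T @ pi)
        new /= new.sum()
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi

def modularity_QE(E, communities):
    """Assumed form of Q(E): sum over communities of T_ab pi_b - pi_a pi_b."""
    T = E / E.sum(axis=0)
    pi = stationary_distribution(E)
    Q = 0.0
    for C in communities:
        idx = np.ix_(C, C)
        Q += (T * pi)[idx].sum() - np.outer(pi, pi)[idx].sum()
    return Q

# Hypothetical directed, weighted line graph with four vertices (edges of G).
E = np.array([[0.0, 1.0,       1.0,       0.0],
              [1.0, 0.0,       1.0 / 3.0, 0.4],
              [1.0, 3.0 / 7.0, 0.0,       0.6],
              [0.0, 4.0 / 7.0, 2.0 / 3.0, 0.0]])

pi = stationary_distribution(E)
Q_split = modularity_QE(E, [[0, 1], [2, 3]])
Q_all = modularity_QE(E, [[0, 1, 2, 3]])
```

In practice the partition itself would come from a modularity optimiser such as the Louvain algorithm mentioned in the text; the sketch only evaluates the quality of a given partition.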



Literary Characters Coappearance

Our first example of a weighted graph is based on the
appearances of characters in the same chapter of Les Misérables
[44]. The vertices are different characters and the
weight of edges is the number of chapters in which that
pair of characters has appeared together. The results of
performing a vertex partition on the line graph E(G) are
shown in Fig. 3. The result is generally compatible with
the vertex partition found in [15] and presumably reflects
the natural communities that a narrative structure will
produce in many novels and plays. However, the main advantage
of our edge colouring approach is that characters, especially
the main ones, will belong to several communities,
as indicated by the different coloured edges. In particular
the main protagonist, the vertex labelled Valjean in Fig. 3,
is connected to all but one community but the strength of
his connection to each community varies significantly, as
Table 1 shows.

Community     Valjean Membership
Myriel         7%
Marius        38%
Fantine        6%
Thenardier    15%
Javert        22%
Judge          9%
Enjolras       4%

Table 1. Table showing the fraction of edge weight incident
at the Valjean vertex in the communities found by optimising
the modularity Q(E) of (13). Communities are labelled by the
character (other than Valjean) with the largest weight of edges
in that community.

Fig. 3. Part of the graph of characters in Les Misérables, centred
on the main character Valjean. Characters are linked by
an edge if they appear in the same scene and the weight is
equal to the number of chapters in which they both appear
[44]. The edge colours reflect a partition which produces an
approximate maximal value of Q(E). This method allows vertices
to be a member of many communities, appropriate for
many characters such as Valjean shown here.



Clustering Non-Negative Matrices

It is common to come across dense matrices with non- 
negative entries. One will often be interested in reducing 
the dimension of the space by looking for clusters of en- 
tries which are similar in some sense. By converting these 
matrices into a sparse graph, the problem becomes equiv- 
alent to the search for communities in networks. 

Fig. 4. The edge partition of a graph of Middle Bronze Age
sites in the Aegean. The weight of an edge is $\theta((1 + d^4)^{-1} - 0.220)$
where d is the distance in 100 kilometres between two
sites. 100km is roughly the distance one could travel in a day.
The distances have been estimated using the shortest route
where land travel is weighted by a factor of 3.0 while sea travel
is weighted by 1.0 [45]. The threshold of 0.220 is chosen such
that 33 of the 34 sites form a connected graph. The edge colours
reflect a partition which produces an approximate maximal
value of Q(E).

We illustrate our approach with an example of geographical
separation of sites. We consider a set of 33 important
Middle Bronze Age sites in the Aegean (c. 2000BC-1400BC)
taken from [45,46]. In the corresponding graph,
the sites are vertices and edges are given a weight which
is a monotonically decreasing function of the distance between
two sites. Finally, to produce a sparse graph a threshold
is used and any edge with weight below this value is
removed. The edge partition of this graph found by optimising
the modularity of the line graph E(G) is shown
in Fig. 4. This produces five communities: Asia Minor and
the Dodecanese (Miletus), the Cyclades (Naxos), Eastern
Crete (Palaikastro), Central and Western Crete (Knossos)
and a small group centred on Attica (Aegina). A vertex
partition might well uncover similar groups but it would
not emphasise that some sites may have a more complex
relationship to the main groups. For instance, Akrotiri
on modern Santorini in the Cyclades is part of both the
Cycladean and a Cretan community. This emphasises the
role it may have played both in the expansion of Minoan
influence during this era, and in its demise following the
destruction of Akrotiri in the eruption of ancient Thera
(Santorini is the modern remnant). Another way to see the
usefulness of this type of approach is to compare against a
more traditional dendrogram analysis of the distance matrix,
such as shown in Fig. 5. For instance, the special role
of Akrotiri is not apparent in the dendrogram of Fig. 5.
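The conversion of a dense distance matrix into a sparse weighted graph, as used for the Aegean sites, can be sketched in a few lines. Everything specific here is an assumption made for illustration: the function names, the three-site distance matrix, the decay function $1/(1 + d^2)$ and the threshold value are hypothetical and are not the choices used for Fig. 4.

```python
import numpy as np

def sparse_graph_from_distances(dist, threshold, weight_fn):
    """Keep an edge (i, j) only if its weight reaches the threshold."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    edges, weights = [], []
    for i in range(n):
        for j in range(i + 1, n):
            w = weight_fn(dist[i, j])
            if w >= threshold:        # edges below the threshold are removed
                edges.append((i, j))
                weights.append(w)
    return edges, weights

def decay(d):
    """Hypothetical monotonically decreasing weight function."""
    return 1.0 / (1.0 + d * d)

# Hypothetical symmetric distance matrix for three sites.
dist = [[0.0, 0.5, 3.0],
        [0.5, 0.0, 1.0],
        [3.0, 1.0, 0.0]]
edges, weights = sparse_graph_from_distances(dist, threshold=0.3, weight_fn=decay)
```

The resulting edge list and weights are exactly the inputs needed to build a weighted line graph, so the same pipeline applies to any dense non-negative matrix once a weight function and threshold have been chosen.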



Academic Coauthorship 

In Fig. 6 we show part of the weighted graph representing
the coauthorships of scientists on some network papers,
as defined by Newman [47]. The edges are partitioned by
searching for a large Q(E). Here we find that some of the
most productive scientists are the focus of one community,
and they participate in other communities much less often.
The links between these groups are often provided by less
prominent researchers, reminding one of the strength of
weak links hypothesis of Granovetter [48]. For instance,
in Fig. 6 Barabasi is the centre of one main community
though a few edges incident at the Barabasi vertex are
also part of two other communities.



Fig. 5. A dendrogram derived from the matrix of distances
between 33 key sites of the Middle Bronze Age in the Aegean.
The horizontal lines indicate the average distance between the
groups of sites indicated by the vertical lines below that horizontal
line.

Fig. 6. Part of the coauthorship network of scientists, as defined
by Newman [47]. Each paper of $k$ authors contributes a
weight of $(k - 1)^{-1}$ to an edge between each of the $k(k - 1)/2$
pairs of collaborators. The edge colours reflect a partition
which produces an approximate maximal value of Q(E).
5 Possible generalisations

In this paper, we have focused on line graphs without self-loops.
However, there are natural alternatives to our definitions
which include self-loops in the line graphs [13]. Their
adjacency matrices take the form $\sum_i \tilde{B}_{\alpha i} B_{i\beta} / v_i$ where obvious
choices for $v_i$ are 1, the degree $k_i$, the strength $s_i$ or
the product $(k_i s_i)$, which are the analogues of C(G) (11),
D(G), E(G) (12), and F(G) respectively. One advantage
these line graphs have over our previous definitions is
that all connected vertices are explicitly represented in
these graphs. The presence of self-loops corresponds to
allowing random walkers to move first to either vertex at
the ends of an undirected edge, and then to
come back to finish on the same edge they started from.
Whether this type of random walk and these line graphs
are a better way of studying the graph G will depend on
the context. Interestingly, in the context of community
detection, adding self-loops is a technique used to alter
the resolution of algorithms [17]. Thus it may be that for
community detection there is little difference in practice
if one also alters the number of communities found by an
algorithm, e.g. by altering modularity [16,17].

Our formalism can also be generalised to situations
where the original graphs G have self-loops or multiple
edges between vertices, which have not been considered
so far. Indeed, self-loops and multiple edges are correctly
encoded in the incidence matrix representation $\tilde{B}(G)$ of
(8). The presence of self-loops requires some adaptation
of our formulae but multigraphs are included without any
change. A multigraph representation could have interesting
consequences, as it could allow edges to be a member
of several different communities. In this case the original
edge is split into several edges whose total weight is equal
to that of the original edge. In social networks this means
the relationship between two individuals can be of more
than one type, e.g. two work colleagues may also share the
same hobby.

Finally, our results can be generalised to cases where
the original graph itself is directed. To do so, we propose to
look at the unweighted incidence matrix $B$ in terms of the
incoming edges, that is $B_{i\alpha} = 1$ if edge $\alpha$ goes into vertex
$i$. The weighted incidence matrix $\tilde{B}$ would be defined in
terms of the source vertex of an edge and its weight, so
$\tilde{B}_{\alpha j} = w_\alpha$ if edge $\alpha$ of weight $w_\alpha$ is leaving vertex $j$. The
adjacency matrix of G is then

$$A_{ij} = \sum_{\alpha} B_{i\alpha} \tilde{B}_{\alpha j}, \qquad (14)$$

while the adjacency matrices of the line graphs are given
by

$$\sum_{i} \frac{\tilde{B}_{\alpha i} B_{i\beta}}{v_i}, \qquad (15)$$

where $v_i$ can be 1 for C(G), $k_i$ (the number of edges leaving $i$) for D(G),
$s_i = \sum_{\alpha} \tilde{B}_{\alpha i}$ for E(G), or $(k_i s_i)$ for F(G). It is interesting
to note that a random walker performing a link-node-link
random walk on the original graph G (see Fig. 1)
now corresponds to exactly the same process as the
usual vertex random walk on the original graph. This was
not the case when dealing with undirected graphs, as the
sequence $\alpha - i - \beta - i - \alpha$ is legitimate in terms of the
link-node-link random walks on G, while it is not legitimate
for a traditional vertex random walk, i.e. the single step
$i - \beta - i$ is not allowed in the usual vertex walk process
on G. With directed graphs G (assuming no self-loops)
no edge can have the same source and target vertices so
such a sequence never appears. In other words, the modularity
for line graphs D(G), E(G) and F(G) defined for
directed graphs are identical. If this is advantageous one
can always choose to represent an undirected graph as a
directed graph to obtain these benefits. However, it is not
clear if these small differences between the random walks
implicit in the construction of the line graphs will produce
any significant differences in the analysis of a given
network.
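The correspondence between the link-node-link walk and the vertex walk on a directed graph can be checked on a small example. The sketch below is an assumption-laden illustration: it takes the directed line graph E(G) to have entries $w_\alpha / s_i$ whenever edge $\beta$ enters vertex $i$ and edge $\alpha$ leaves it, with $s_i$ the out-strength; the function name and the example edges are hypothetical.

```python
import numpy as np

def directed_line_graph_E(n_vertices, edges, weights):
    """edges[a] = (source, target). Returns E, the adjacency A and out-strengths."""
    L = len(edges)
    w = np.asarray(weights, dtype=float)
    B_in = np.zeros((n_vertices, L))    # B_in[i, a] = 1 if edge a goes into i
    B_out = np.zeros((L, n_vertices))   # B_out[a, j] = w_a if edge a leaves j
    for a, (src, tgt) in enumerate(edges):
        B_in[tgt, a] = 1.0
        B_out[a, src] = w[a]
    s_out = B_out.sum(axis=0)           # out-strength of each vertex
    inv = np.zeros(n_vertices)
    inv[s_out > 0] = 1.0 / s_out[s_out > 0]
    E = B_out @ (inv[:, None] * B_in)   # E[a, b] = sum_i B_out[a, i] B_in[i, b] / s_i
    A = B_in @ B_out                    # A[i, j] = total weight of edges j -> i
    return E, A, s_out

# Hypothetical directed graph on three vertices without self-loops.
edges = [(0, 1), (1, 2), (2, 0), (1, 0), (0, 2)]
weights = [1.0, 2.0, 1.0, 3.0, 1.0]
E, A, s_out = directed_line_graph_E(3, edges, weights)
```

Summing the entries of a column of E over all edges that enter a given vertex m reproduces the vertex-walk transition probability from the head of the current edge to m, and the diagonal of E vanishes without any explicit $(1 - \delta_{\alpha\beta})$ factor because no edge both enters and leaves the same vertex.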



6 Conclusion 

In this paper, we have extended our work on line graphs
from unweighted [23] to weighted graphs. We have shown
that this generalisation leads to the construction of line
graphs which are both weighted and directed. The goal
of this simple and natural procedure is to move the focus
from vertices to edges in the original graph for any graph
based problem.

To illustrate this general principle we have used our
weighted line graphs in the context of community detection.
The most popular schemes consist in partitioning
the vertices of the graph, namely in assigning each vertex
to a unique community. Unfortunately, this approach
is known to be inadequate in the many systems where
vertices naturally belong to several communities. This is
the case in social networks for instance, where individuals
(vertices) may be members of several different communities
characterised by different types of relationship, e.g.
family ties, a shared hobby interest, or a work connection.
An edge partition is particularly well adapted to such
situations, as it naturally produces overlapping communities,
while preserving the sound mathematical foundations
of graph partitioning theory. Our approach has the additional
advantage of being easy to implement, as the construction
of a line graph is straightforward and the vertex
partitioning of the line graph by any standard algorithm
directly produces the optimal edge partition of the original
graph. The cost in terms of computer memory and time is
roughly $O(\langle k^2 \rangle / \langle k \rangle)$ (the ratio of edges in the line graph
to the original graph), while the human cost in terms of
code development is minimal^5.

R.L. acknowledges support from the UK EPSRC. 

^5 Codes to construct weighted line graphs and optimise
modularity are freely available for download on the
webpages http://sites.google.com/site/linegraphs/ and
http://sites.google.com/site/findcommunities/.